AIbase

# GUI Visual Positioning

GUI Actor 7B Qwen2 VL
MIT
GUI-Actor-7B is a vision-language model built on Qwen2-VL-7B-Instruct. It targets graphical user interface (GUI) agent tasks and provides a coordinate-free visual grounding solution.
Multimodal Fusion Transformers
microsoft
207
14
Uground V1 2B
Apache-2.0
UGround is a powerful GUI visual positioning model trained using a simple method and jointly developed by OSUNLP and Orby AI.
Multimodal Fusion Transformers English
osunlp
975
8
Uground
UGround is a powerful GUI visual positioning model trained with a streamlined recipe, developed by the Ohio State University NLP Group in collaboration with Orby AI.
Image-to-Text
osunlp
208
23
© 2025 AIbase